KD-Tree Based Clustering for Gene Expression Data
Abstract
K-means is one of the most widely researched clustering algorithms. However, it is sensitive to the selection of the initial cluster centers and to the estimate of the number of clusters. In this chapter, we propose a novel approach that finds efficient initial cluster centers using a kd-tree and computes the number of clusters using a joint distance function. We have carried out extensive experiments on various synthetic data sets as well as gene expression data. The Dunn validity index is used to examine the quality of the clusters in the case of multi-dimensional gene expression data. The experimental results are compared with existing techniques in terms of the Dunn validity index and the number of iterations.
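The Dunn validity index mentioned above can be illustrated with a minimal sketch. It assumes the common textbook form of the index (smallest distance between points of different clusters divided by the largest cluster diameter) and Euclidean distance; the abstract does not state which variant or distance the chapter uses, and the function name `dunn_index` is hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def dunn_index(points, labels):
    """Dunn index under the assumptions stated above (not the chapter's code).

    points: sequence of coordinate tuples; labels: cluster label per point.
    """
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("Dunn index needs at least two clusters")

    # Largest distance between two points of the same cluster (diameter).
    max_diameter = max(
        (dist(p, q)
         for members in clusters.values()
         for p, q in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points of different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )
    if max_diameter == 0.0:
        return float("inf")  # every cluster is a single point
    return min_separation / max_diameter
```

Larger values indicate compact, well-separated clusters, so a partition that scores higher on the same data is preferred under this index. The pairwise loops are quadratic in the number of points, which is acceptable for a sketch but may be slow on large expression matrices.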
Similar resources
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Clustering Gene Expression Data Using Random Forest Dissimilarity
Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data typically involve a large number of variables (genes) compared with the number of samples (patients). Many clustering methods have been built on the dissimilarity among observations, calculated by a distance function. As increa...
Ant-MST: An Ant-Based Minimum Spanning Tree for Gene Expression Data Clustering
We have proposed an ant-based clustering algorithm for document clustering based on the travelling salesperson scenario. In this paper, we present an approach called Ant-MST for gene expression data clustering based on both ant-based clustering and minimum spanning trees (MST). The ant-based clustering algorithm is first used to construct a fully connected network of nodes. Each node repres...
SCM-driven Tree View for Microarray Data
Eisen’s tree view is a useful tool for clustering and displaying microarray gene expression data. In Eisen’s tree view system, a hierarchical method is used for clustering data. However, some useful information in gene expression data may not be well represented when hierarchical clustering is used directly in Eisen’s tree view. In this paper, we embed the similarity-based clustering method (SCM...
Gene Expression Data Clustering and Visualization Based on a Binary Hierarchical Clustering Framework
We describe the use of a binary hierarchical clustering (BHC) framework for clustering of gene expression data. The BHC algorithm involves two major steps. Firstly, the K-means algorithm is used to split the data into two classes. Secondly, the Fisher criterion is applied to the classes to assess whether the splitting is acceptable. The algorithm is applied to the sub-classes recursively and en...